Computer-Aided Lung Nodule Recognition by SVM Classifier Based on Combination of Random Undersampling and SMOTE
Abstract
In computer-aided detection/diagnosis (CAD) systems for lung cancer, classification of regions of interest (ROIs) is often used to detect or diagnose lung nodules accurately. However, unbalanced datasets often degrade classification performance. In this paper, both the minority and majority classes are resampled to increase generalization ability. We propose a novel SVM classifier combined with random undersampling (RU) and SMOTE for lung nodule recognition. Combining the two resampling methods not only yields a balanced training set but also removes noise and duplicate information from the training samples while retaining useful information, which improves data utilization and hence the performance of the SVM algorithm for pulmonary nodule classification on unbalanced data. Eight features, including 2D and 3D features, are extracted for training and classification. Experimental results show that, for different sizes of training datasets, our RU-SMOTE-SVM classifier achieves the highest classification accuracy among the four kinds of classifiers, with an average classification accuracy of more than 92.94%.
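The resampling idea in the abstract can be illustrated with a minimal sketch: shrink the majority class by random undersampling, then grow the minority class with SMOTE-style interpolation until both classes have the same size. This is not the authors' implementation. The function names, the `target_size` and `k` parameters, and the toy 2D data are assumptions made for illustration, and the abstract does not state the resampling ratios actually used. The SVM training step is omitted; any SVM implementation could be fitted on the balanced output.

```python
import math
import random


def random_undersample(samples, target_size, rng):
    """Keep a random subset of the majority class (no replacement)."""
    if target_size >= len(samples):
        return list(samples)
    return rng.sample(samples, target_size)


def smote(samples, n_new, k, rng):
    """Create n_new synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    if n_new <= 0:
        return []
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def rebalance(majority, minority, target_size, k=5, seed=0):
    """Hypothetical helper: bring both classes to target_size samples."""
    rng = random.Random(seed)
    kept = random_undersample(majority, target_size, rng)
    n_new = max(0, target_size - len(minority))
    boosted = list(minority) + smote(minority, n_new, k, rng)
    return kept, boosted


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data: 100 majority (non-nodule) and 10 minority (nodule) points.
    majority = [(rng.random(), rng.random()) for _ in range(100)]
    minority = [(2 + rng.random(), 2 + rng.random()) for _ in range(10)]
    kept, boosted = rebalance(majority, minority, target_size=40)
    print(len(kept), len(boosted))
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority class already occupies, whereas plain duplication would only repeat existing points.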
Similar resources
Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination
Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...
A New Computer-Aided Detection System for Pulmonary Nodule in CT Scan Images of Cancerous Patients
Introduction: In lung cancer, a computer-aided detection system that is capable of detecting very small glands in a high volume of CT images is very useful. This study provided a novel system for detection of pulmonary nodules in CT images. Methods: In a case-control study, CT scans of the chest of 20 patients referred to Yazd Social Security Hospital were examined. In the two-dimensional and ...
Improved Sampling Techniques for Learning an Imbalanced Data Set
This paper presents the performance of a classifier built using the stackingC algorithm in nine different data sets. Each data set is generated using a sampling technique applied on the original imbalanced data set. Five new sampling techniques are proposed in this paper (i.e., SMOTERandRep, Lax Random Oversampling, Lax Random Undersampling, Combined-Lax Random Oversampling Undersampling, and C...
Predicting credit card customer churn in banks using data mining
In this paper, we address customer credit card churn prediction via data mining. We developed an ensemble system incorporating majority voting and involving Multilayer Perceptron (MLP), Logistic Regression (LR), decision trees (J48), Random Forest (RF), Radial Basis Function (RBF) network and Support Vector Machine (SVM) as the constituents. The dataset was taken from the Business Intelligenc...
A Computer Aided Pulmonary Nodule Detection System Using Multiple Massive Training SVMs
A computer aided pulmonary nodule detection system for chest radiography is proposed. The system consists of three models, viz., lung segmentation, lung nodule candidates detection and false positive reduction. Several innovations are offered in this system. The first one is that the detection of potential lung nodule candidates is conceived as a filtering process that searches for any region w...
Volume: 2015
Publication date: 2015